Harvesting and Organizing Knowledge from the Web
Internal identifier: 000857 (Main/Exploration); previous: 000856; next: 000858
Authors: Gerhard Weikum [Germany]
Source:
- Lecture Notes in Computer Science [ 0302-9743 ] ; 2007.
English descriptors
- Teeft :
- Abstract information organization, Adbis, Agichtein, Algorithmic, Annotating, Artif, Auer, Avor, Banko, Berlin heidelberg, Best example, Broadhead, Cafarella, Certain types, Computationally, Conceptnet, Context awareness, Data management issues, Database, December, Downey, Eld, Elusive goal, Enormous progress, Entity search, Eswc, Etzioni, Experimental study, Explicit entities, Explicit knowledge sources, Explicit structure, Faceted, Faceted search, Fellbaum, Folksonomies, Geneontology, Gerhard weikum institute, Gloomy picture, Glorious form, Great opportunties, Harvest knowledge, Heidelberg, Human contributions, Human supervision, Hyperlinked, Hyperlinked text, Ieee, Ieee data engineering bulletin, Ifrim, Ijcai, Informatics, Informatics saarbruecken, Information extraction, Innsbruck, Intell, Interesting asset, Interesting research themes, Ioannidis, June, Kasneci, Knowledge bases, Knowledge management, Knowledge repositories, Knowledge sources, Koudas, Large extent, Leipzig, Link structure, Lncs, Major advances, More knowledge, Multilingual, Multilingual thesauri, Music bands, Natural language processing, Novikov, Ontological concepts, Ontology, Open information extraction, Opencyc, Opportunties, Other sources, Popescu, Rachev, Recent years, Relation patterns, Rich knowledge repositories, Rigorous representations, Rocket science, Saarbruecken, Sarawagi, Scalable, Scalable information extraction, Semantic knowledge, Semantics, Shaked, Similar sources, Snomed, Social networks, Soderland, Special issue, Springer, Staab, Statistical analysis, Strong proliferation, Strong trends, Studer, Such issues, Suchanek, Suciu, Sumo, Synergy, Taxonomy, Technology entity recognition, Terminological, Terminological taxonomies, Text sources, Thematic categories, Thesaurus, Topic recognition, Tutorial, Tutorial slides, Umls, Unsupervised, Unsupervised extraction, Vertical search, Weikum, Wiki, Wiki content, Wikipedia, Wordnet, Yago, Yates.
Abstract
Information organization and search on the Web is gaining structure and context awareness and more semantic flavor, for example, in the forms of faceted search, vertical search, entity search, and Deep-Web search. I envision another big leap forward by automatically harvesting and organizing knowledge from the Web, represented in terms of explicit entities and relations as well as ontological concepts. This will be made possible by the confluence of three strong trends: 1) rich Semantic-Web-style knowledge repositories like ontologies and taxonomies, 2) large-scale information extraction from high-quality text sources such as Wikipedia, and 3) social tagging in the spirit of Web 2.0. I refer to the three directions as Semantic Web, Statistical Web, and Social Web (at the risk of some oversimplification), and I briefly characterize each of them.
DOI: 10.1007/978-3-540-75185-4_2
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000A93
- to stream Istex, to step Curation: 000A37
- to stream Istex, to step Checkpoint: 000671
- to stream Main, to step Merge: 000857
- to stream Main, to step Curation: 000857
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Harvesting and Organizing Knowledge from the Web</title>
<author><name sortKey="Weikum, Gerhard" sort="Weikum, Gerhard" uniqKey="Weikum G" first="Gerhard" last="Weikum">Gerhard Weikum</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:674E11A735F9A5B51D1E8967050168105885CA1B</idno>
<date when="2007" year="2007">2007</date>
<idno type="doi">10.1007/978-3-540-75185-4_2</idno>
<idno type="url">https://api.istex.fr/document/674E11A735F9A5B51D1E8967050168105885CA1B/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000A93</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">000A93</idno>
<idno type="wicri:Area/Istex/Curation">000A37</idno>
<idno type="wicri:Area/Istex/Checkpoint">000671</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">000671</idno>
<idno type="wicri:doubleKey">0302-9743:2007:Weikum G:harvesting:and:organizing</idno>
<idno type="wicri:Area/Main/Merge">000857</idno>
<idno type="wicri:Area/Main/Curation">000857</idno>
<idno type="wicri:Area/Main/Exploration">000857</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Harvesting and Organizing Knowledge from the Web</title>
<author><name sortKey="Weikum, Gerhard" sort="Weikum, Gerhard" uniqKey="Weikum G" first="Gerhard" last="Weikum">Gerhard Weikum</name>
<affiliation wicri:level="1"><country xml:lang="fr">Allemagne</country>
<wicri:regionArea>Max-Planck Institute for Informatics, Saarbruecken</wicri:regionArea>
<wicri:noRegion>Saarbruecken</wicri:noRegion>
<wicri:noRegion>Saarbruecken</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Allemagne</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2007</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="Teeft" xml:lang="en"><term>Abstract information organization</term>
<term>Adbis</term>
<term>Agichtein</term>
<term>Algorithmic</term>
<term>Annotating</term>
<term>Artif</term>
<term>Auer</term>
<term>Avor</term>
<term>Banko</term>
<term>Berlin heidelberg</term>
<term>Best example</term>
<term>Broadhead</term>
<term>Cafarella</term>
<term>Certain types</term>
<term>Computationally</term>
<term>Conceptnet</term>
<term>Context awareness</term>
<term>Data management issues</term>
<term>Database</term>
<term>December</term>
<term>Downey</term>
<term>Eld</term>
<term>Elusive goal</term>
<term>Enormous progress</term>
<term>Entity search</term>
<term>Eswc</term>
<term>Etzioni</term>
<term>Experimental study</term>
<term>Explicit entities</term>
<term>Explicit knowledge sources</term>
<term>Explicit structure</term>
<term>Faceted</term>
<term>Faceted search</term>
<term>Fellbaum</term>
<term>Folksonomies</term>
<term>Geneontology</term>
<term>Gerhard weikum institute</term>
<term>Gloomy picture</term>
<term>Glorious form</term>
<term>Great opportunties</term>
<term>Harvest knowledge</term>
<term>Heidelberg</term>
<term>Human contributions</term>
<term>Human supervision</term>
<term>Hyperlinked</term>
<term>Hyperlinked text</term>
<term>Ieee</term>
<term>Ieee data engineering bulletin</term>
<term>Ifrim</term>
<term>Ijcai</term>
<term>Informatics</term>
<term>Informatics saarbruecken</term>
<term>Information extraction</term>
<term>Innsbruck</term>
<term>Intell</term>
<term>Interesting asset</term>
<term>Interesting research themes</term>
<term>Ioannidis</term>
<term>June</term>
<term>Kasneci</term>
<term>Knowledge bases</term>
<term>Knowledge management</term>
<term>Knowledge repositories</term>
<term>Knowledge sources</term>
<term>Koudas</term>
<term>Large extent</term>
<term>Leipzig</term>
<term>Link structure</term>
<term>Lncs</term>
<term>Major advances</term>
<term>More knowledge</term>
<term>Multilingual</term>
<term>Multilingual thesauri</term>
<term>Music bands</term>
<term>Natural language processing</term>
<term>Novikov</term>
<term>Ontological concepts</term>
<term>Ontology</term>
<term>Open information extraction</term>
<term>Opencyc</term>
<term>Opportunties</term>
<term>Other sources</term>
<term>Popescu</term>
<term>Rachev</term>
<term>Recent years</term>
<term>Relation patterns</term>
<term>Rich knowledge repositories</term>
<term>Rigorous representations</term>
<term>Rocket science</term>
<term>Saarbruecken</term>
<term>Sarawagi</term>
<term>Scalable</term>
<term>Scalable information extraction</term>
<term>Semantic knowledge</term>
<term>Semantics</term>
<term>Shaked</term>
<term>Similar sources</term>
<term>Snomed</term>
<term>Social networks</term>
<term>Soderland</term>
<term>Special issue</term>
<term>Springer</term>
<term>Staab</term>
<term>Statistical analysis</term>
<term>Strong proliferation</term>
<term>Strong trends</term>
<term>Studer</term>
<term>Such issues</term>
<term>Suchanek</term>
<term>Suciu</term>
<term>Sumo</term>
<term>Synergy</term>
<term>Taxonomy</term>
<term>Technology entity recognition</term>
<term>Terminological</term>
<term>Terminological taxonomies</term>
<term>Text sources</term>
<term>Thematic categories</term>
<term>Thesaurus</term>
<term>Topic recognition</term>
<term>Tutorial</term>
<term>Tutorial slides</term>
<term>Umls</term>
<term>Unsupervised</term>
<term>Unsupervised extraction</term>
<term>Vertical search</term>
<term>Weikum</term>
<term>Wiki</term>
<term>Wiki content</term>
<term>Wikipedia</term>
<term>Wordnet</term>
<term>Yago</term>
<term>Yates</term>
</keywords>
</textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: Information organization and search on the Web is gaining structure and context awareness and more semantic flavor, for example, in the forms of faceted search, vertical search, entity search, and Deep-Web search. I envision another big leap forward by automatically harvesting and organizing knowledge from the Web, represented in terms of explicit entities and relations as well as ontological concepts. This will be made possible by the confluence of three strong trends: 1) rich Semantic-Web-style knowledge repositories like ontologies and taxonomies, 2) large-scale information extraction from high-quality text sources such as Wikipedia, and 3) social tagging in the spirit of Web 2.0. I refer to the three directions as Semantic Web, Statistical Web, and Social Web (at the risk of some oversimplification), and I briefly characterize each of them.</div>
</front>
</TEI>
<affiliations><list><country><li>Allemagne</li>
</country>
</list>
<tree><country name="Allemagne"><noRegion><name sortKey="Weikum, Gerhard" sort="Weikum, Gerhard" uniqKey="Weikum G" first="Gerhard" last="Weikum">Gerhard Weikum</name>
</noRegion>
<name sortKey="Weikum, Gerhard" sort="Weikum, Gerhard" uniqKey="Weikum G" first="Gerhard" last="Weikum">Gerhard Weikum</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Sarre/explor/MusicSarreV3/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000857 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000857 | SxmlIndent | more
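Once the record has been printed by one of the commands above, its fields can be read with ordinary XML tooling. The following is a minimal Python sketch, not part of Dilib. It assumes the output is a single `<record>` element like the one shown on this page. In that record the `wicri:` prefix is used but never declared, so a namespace-aware parser would reject it; the sketch binds the prefix to a placeholder URI (`urn:example:wicri`, an assumption made purely for illustration) before parsing. The helper names `parse_record` and `summarize` are hypothetical, and `SAMPLE` is a shortened copy of the record above.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (an assumption): it only makes the
# undeclared `wicri:` prefix acceptable to the parser.
WICRI_NS = "urn:example:wicri"


def parse_record(xml_text: str) -> ET.Element:
    """Parse a <record> whose `wicri:` prefix is not declared."""
    # Bind the prefix on the root element; only the first <record> is patched.
    patched = xml_text.replace(
        "<record>", f'<record xmlns:wicri="{WICRI_NS}">', 1
    )
    return ET.fromstring(patched)


def summarize(record: ET.Element) -> dict:
    """Collect a few bibliographic fields from the TEI header."""
    return {
        "title": record.findtext(".//titleStmt/title"),
        "author": record.findtext(".//titleStmt/author/name"),
        "year": record.findtext(".//publicationStmt/date"),
        "doi": record.findtext(".//idno[@type='doi']"),
    }


# Shortened copy of the record shown on this page.
SAMPLE = """<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Harvesting and Organizing Knowledge from the Web</title>
<author><name sortKey="Weikum, Gerhard">Gerhard Weikum</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<date when="2007" year="2007">2007</date>
<idno type="doi">10.1007/978-3-540-75185-4_2</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

if __name__ == "__main__":
    print(summarize(parse_record(SAMPLE)))
```

To process real output instead of the sample, the text passed to `parse_record` could be read from standard input (for example with `sys.stdin.read()`) at the end of the `HfdSelect` pipeline, provided the output really is one complete `<record>` element.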
To link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Sarre |area= MusicSarreV3 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:674E11A735F9A5B51D1E8967050168105885CA1B |texte= Harvesting and Organizing Knowledge from the Web }}
This area was generated with Dilib version V0.6.33.